A comparison of homonym meaning frequency estimates derived from movie and television subtitles, free association, and explicit ratings
First Online: 10 September 2018
Most words are ambiguous, with interpretation dependent on context. Advancing theories of ambiguity resolution is important for any general theory of language processing, and for resolving inconsistencies in observed ambiguity effects across experimental tasks. Focusing on homonyms (words such as bank with unrelated meanings EDGE OF A RIVER vs. FINANCIAL INSTITUTION), the present work advances theories and methods for estimating the relative frequency of their meanings, a factor that shapes observed ambiguity effects. We develop a new method for estimating meaning frequency based on the meaning of a homonym evoked in lines of movie and television subtitles according to human raters. We also replicate and extend a measure of meaning frequency derived from the classification of free associates. We evaluate the internal consistency of these measures, compare them to published estimates based on explicit ratings of each meaning’s frequency, and compare each set of norms in predicting performance in lexical and semantic decision mega-studies. All measures have high internal consistency and show agreement, but each is also associated with unique variance, which may be explained by integrating cognitive theories of memory with the demands of different experimental methodologies. To derive frequency estimates, we collected manual classifications of 533 homonyms over 50,000 lines of subtitles, and of 357 homonyms across over 5000 homonym–associate pairs. This database, publicly available at www.blairarmstrong.net/homonymnorms/, constitutes a novel resource for computational cognitive modeling and computational linguistics, and we offer suggestions around good practices for its use in training and testing models on labeled data.
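As a rough illustration of how manual classifications could be turned into relative meaning frequencies, the sketch below counts how often each meaning was chosen for a homonym and normalizes the counts. The function name, the input layout (pairs of homonym and chosen meaning), and the toy data are assumptions made for this example; they are not taken from the published database or its scoring procedure.

```python
from collections import Counter, defaultdict


def meaning_frequencies(classifications):
    """Return, per homonym, the share of classified lines assigned to each meaning.

    `classifications` is an iterable of (homonym, meaning) pairs, one pair per
    rated line or associate. This layout is hypothetical.
    """
    counts = defaultdict(Counter)
    for homonym, meaning in classifications:
        counts[homonym][meaning] += 1

    frequencies = {}
    for homonym, meaning_counts in counts.items():
        total = sum(meaning_counts.values())
        frequencies[homonym] = {
            meaning: n / total for meaning, n in meaning_counts.items()
        }
    return frequencies


# Toy input, invented for illustration only.
toy = [("bank", "FINANCIAL INSTITUTION")] * 3 + [("bank", "EDGE OF A RIVER")]
freqs = meaning_frequencies(toy)
```

Here `freqs["bank"]` maps each meaning to its share of the toy classifications; a real analysis would also need to handle lines that raters could not assign to any listed meaning.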
Method For Making 2-Electron Response Reduced Density Matrices Approximately N-representable
In methods like geminal-based approaches or coupled cluster that are solved
using the projected Schrödinger equation, direct computation of the
2-electron reduced density matrix (2-RDM) is impractical and one falls back to
a 2-RDM based on response theory. However, the 2-RDMs from response theory are
not N-representable. That is, the response 2-RDM does not correspond to an
actual physical N-electron wave function. We present a new algorithm for
making these non-N-representable 2-RDMs approximately N-representable, i.e.
the corrected 2-RDM has the right symmetry and normalization and fulfills the
P-, Q- and G-conditions. Next to an algorithm which can be applied to any
2-RDM, we have also developed a 2-RDM optimization procedure specifically for
seniority-zero 2-RDMs. We aim to find the 2-RDM with the right properties that
is the closest (in the sense of the Frobenius norm) to the
non-N-representable 2-RDM by minimizing the square norm of the difference
between the initial 2-RDM and the targeted 2-RDM under the constraint that the
trace is normalized and the 2-RDM, Q- and G-matrices are positive
semidefinite, i.e. their eigenvalues are non-negative. Our method is suitable
for fixing non-N-representable 2-RDMs which are close to being
N-representable. Through the N-representability optimization algorithm we add
a small correction to the initial 2-RDM such that it fulfills the most
important N-representability conditions.
Comment: 13 pages, 8 figures
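The constrained minimization described in this abstract can be illustrated with a much simpler problem: finding the symmetric matrix closest in Frobenius norm to a given one, subject only to positive semidefiniteness and a fixed trace. The sketch below is not the authors' algorithm, which also enforces the Q- and G-conditions and the 2-RDM symmetries. It assumes NumPy, and the function names and the toy matrix are invented for this example. With only these two constraints, the solution keeps the eigenvectors and projects the eigenvalues onto the set of non-negative numbers with the prescribed sum.

```python
import numpy as np


def project_onto_scaled_simplex(values, total):
    """Euclidean projection of `values` onto {x >= 0, sum(x) = total}."""
    sorted_desc = np.sort(values)[::-1]
    shifted_cumsum = np.cumsum(sorted_desc) - total
    ranks = np.arange(1, len(values) + 1)
    # Largest index whose entry stays positive after the uniform shift.
    last = np.nonzero(sorted_desc - shifted_cumsum / ranks > 0)[0][-1]
    shift = shifted_cumsum[last] / (last + 1)
    return np.maximum(values - shift, 0.0)


def nearest_psd_with_trace(matrix, trace):
    """Closest (Frobenius norm) symmetric PSD matrix with the given trace.

    Only positivity and normalization are imposed here; this is a
    deliberately reduced version of the problem in the abstract.
    """
    symmetric = 0.5 * (matrix + matrix.T)
    eigenvalues, eigenvectors = np.linalg.eigh(symmetric)
    corrected = project_onto_scaled_simplex(eigenvalues, trace)
    return (eigenvectors * corrected) @ eigenvectors.T


# Toy matrix with one slightly negative eigenvalue (illustrative values).
initial = np.diag([0.7, 0.5, -0.2])
fixed = nearest_psd_with_trace(initial, trace=1.0)
```

For an input that is already close to feasible, the correction is small, which mirrors the abstract's remark that the method targets 2-RDMs that are nearly N-representable.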
Snapshots of a molecular swivel in action
Members of the serine family of site-specific recombinases exchange DNA strands via 180° rotation about a central protein-protein interface. Modeling of this process has been hampered by the lack of structures in more than one rotational state for any individual serine recombinase. Here we report crystal structures of the catalytic domains of four constitutively active mutants of the serine recombinase Sin, providing snapshots of rotational states not previously visualized for Sin, including two seen in the same crystal. Normal mode analysis predicted that each tetramer's lowest frequency mode (i.e. most accessible large-scale motion) mimics rotation: two protomers rotate as a pair with respect to the other two. Our analyses also suggest that rotation is not a rigid body movement around a single symmetry axis but instead uses multiple pivot points and entails internal motions within each subunit.
Inversion asymmetry effects in modulation-doped Cd1-xMnxTe quantum wells
We report a striking in-plane anisotropy of the spin-flip Raman signals observed for dilute magnetic Cd1−xMnxTe quantum wells containing a two-dimensional electron gas. The effect depends upon electron concentration, which can be varied within a single sample via secondary above-barrier illumination. The experimental results are described in a simple, single-electron picture by a model of the conduction band Hamiltonian that includes contributions from Dresselhaus, Rashba, and Zeeman terms.
Genetic Diversity and Association Studies in US Hispanic/Latino Populations: Applications in the Hispanic Community Health Study/Study of Latinos
US Hispanic/Latino individuals are diverse in genetic ancestry, culture, and environmental exposures. Here, we characterized and controlled for this diversity in genome-wide association studies (GWASs) for the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). We simultaneously estimated population-structure principal components (PCs) robust to familial relatedness and pairwise kinship coefficients (KCs) robust to population structure, admixture, and Hardy-Weinberg departures. The PCs revealed substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, and Central and South American). To control for variation among groups, we developed a multi-dimensional clustering method to define a “genetic-analysis group” variable that retains many properties of self-identified background while achieving substantially greater genetic homogeneity within groups and including participants with non-specific self-identification. In GWASs of 22 biomedical traits, we used a linear mixed model (LMM) including pairwise empirical KCs to account for familial relatedness, PCs for ancestry, and genetic-analysis groups for additional group-associated effects. Including the genetic-analysis group as a covariate accounted for significant trait variation in 8 of 22 traits, even after we fit 20 PCs. Additionally, genetic-analysis groups had significant heterogeneity of residual variance for 20 of 22 traits, and modeling this heteroscedasticity within the LMM reduced genomic inflation for 19 traits. Furthermore, fitting an LMM that utilized a genetic-analysis group rather than a self-identified background group achieved higher power to detect previously reported associations. We expect that the methods applied here will be useful in other studies with multiple ethnic groups, admixture, and relatedness.
Genome-wide Association Study of Platelet Count Identifies Ancestry-Specific Loci in Hispanic/Latino Americans
Platelets play an essential role in hemostasis and thrombosis. We performed a genome-wide association study of platelet count in 12,491 participants of the Hispanic Community Health Study/Study of Latinos by using a mixed-model method that accounts for admixture and family relationships. We discovered and replicated associations with five genes (ACTN1, ETV7, GABBR1-MOG, MEF2C, and ZBTB9-BAK1). Our strongest association was with Amerindian-specific variant rs117672662 (p value = 1.16 × 10^-28) in ACTN1, a gene implicated in congenital macrothrombocytopenia. rs117672662 exhibited allelic differences in transcriptional activity and protein binding in hematopoietic cells. Our results underscore the value of diverse populations to extend insights into the allelic architecture of complex traits.
Detectable Clonal Mosaicism from Birth to Old Age and its Relationship to Cancer
Clonal mosaicism for large chromosomal anomalies (duplications, deletions and uniparental disomy) was detected using SNP microarray data from over 50,000 subjects recruited for genome-wide association studies. This detection method requires a relatively high frequency of cells (>5–10%) with the same abnormal karyotype (presumably of clonal origin) in the presence of normal cells. The frequency of detectable clonal mosaicism in peripheral blood is low (<0.5%) from birth until 50 years of age, after which it rises rapidly to 2–3% in the elderly. Many of the mosaic anomalies are characteristic of those found in hematological cancers and identify common deleted regions that pinpoint the locations of genes previously associated with hematological cancers. Although only 3% of subjects with detectable clonal mosaicism had any record of hematological cancer prior to DNA sampling, those without a prior diagnosis have an estimated 10-fold higher risk of a subsequent hematological cancer (95% confidence interval = 6–18).
Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure
Heart failure (HF) is a leading cause of morbidity and mortality worldwide. A small proportion of HF cases are attributable to monogenic cardiomyopathies and existing genome-wide association studies (GWAS) have yielded only limited insights, leaving the observed heritability of HF largely unexplained. We report results from a GWAS meta-analysis of HF comprising 47,309 cases and 930,014 controls. Twelve independent variants at 11 genomic loci are associated with HF, all of which demonstrate one or more associations with coronary artery disease (CAD), atrial fibrillation, or reduced left ventricular function, suggesting shared genetic aetiology. Functional analysis of non-CAD-associated loci implicates genes involved in cardiac development (MYOZ1, SYNPO2L), protein homoeostasis (BAG3), and cellular senescence (CDKN1A). Mendelian randomisation analysis supports causal roles for several HF risk factors, and demonstrates CAD-independent effects for atrial fibrillation, body mass index, and hypertension. These findings extend our knowledge of the pathways underlying HF and may inform new therapeutic strategies.